Immutable Stack

最近实验需要使用到immutable stack，直接复制创建新的栈时间和空间效率都十分低，这里对高效实现immutable stack以及queue进行一个简单的学习总结。由于已经有文章很好的介绍了实现思想，主要是利用链表的思想，这里就直接引用了，具体参考Immutable Stack.

Let’s start with something simple: an immutable stack.

Now, immediately I hear the objection: a stack is by its very nature something that changes. A stack is an abstract data type (ADT) with the interface

1
2
3

void Push(T t);
T Pop();
bool IsEmpty { get; }

You push stuff onto it, you pop stuff off of it, it changes. How can it be immutable?

Every time you need to make a data structure immutable, you use basically the same trick: an operation which “changes” the data structure does so by constructing a new data structure. The original data structure stays the same.

How can that possibly be efficient? Surely we’ll be allocating memory all over the place! Well, actually, in this case, no. An immutable stack is every bit as efficient as a mutable stack. Even better: in some cases, it can be considerably more efficient, as we’ll see.

Let’s start by defining an interface for our immutable structure. While we’re at it, we’ll fix a problem with the stack ADT above, namely that you cannot interrogate the stack without changing it. And we’ll make the stack enumerable just for the heck of it:

public interface IStack<T> : IEnumerable<T>
{
    IStack<T> Push(T value);
    IStack<T> Pop();
    T Peek();
    bool IsEmpty { get; }
}

Pushing and popping give you back an entirely new stack, and Peek lets you look at the top of the stack without popping it.

Now let’s think about constructing one of these things here.

Clearly if we have an existing stack we can construct a new one by pushing or popping it. But we have to start somewhere. Since every empty stack is the same, it seems sensible to have a singleton empty stack.

public sealed class Stack<T> : IStack<T>
{
    private sealed class EmptyStack : IStack<T>
    {
        public bool IsEmpty { get { return true; } }
        public T Peek() { throw new Exception("Empty stack"); }
        public IStack<T> Push(T value) { return new Stack<T>(value, this); }
        public IStack<T> Pop() { throw new Exception("Empty stack"); }
        public IEnumerator<T> GetEnumerator() { yield break; }
        IEnumerator IEnumerable.GetEnumerator() { return this.GetEnumerator(}
    }
    private static readonly EmptyStack empty = new EmptyStack();
    public static IStack<T> Empty { get { return empty; } }
    private readonly T head;
    private readonly IStack<T> tail;
    private Stack(T head, IStack<T> tail)
    {
        this.head = head;
        this.tail = tail;
    }
    public bool IsEmpty { get { return false; } }
    public T Peek() { return head; }
    public IStack<T> Pop() { return tail; }
    public IStack<T> Push(T value) { return new Stack<T>(value, this); }
    public IEnumerator<T> GetEnumerator()
    {
        for(IStack<T> stack = this; !stack.IsEmpty ; stack = stack.Pop())
            yield return stack.Peek();
    }
    IEnumerator IEnumerable.GetEnumerator() {return this.GetEnumerator();}

And now we can easily create stacks and push stuff onto them. Notice how the fact that we have immutability means that stacks with the same tail can share state, saving on memory:

IStack<int> s1 = Stack<int>.Empty;
IStack<int> s2 = s1.Push(10);
IStack<int> s3 = s2.Push(20);
IStack<int> s4 = s2.Push(30); // shares its tail with s3.

Immutable Queue

import java.util.NoSuchElementException;
import java.util.Queue;

public class PersistentQueue<E> {
	
	private static class PersistentStack<E> {
		private E head;
		private PersistentStack<E> tail;
		private int size;

		private PersistentStack(E obj, PersistentStack<E> tail) {
			this.head = obj;
			this.tail = tail;
			this.size = tail.size + 1;
		}

		public static PersistentStack emptyStack() {
			return new PersistentStack();
		}

		private PersistentStack() {
			this.head = null;
			this.tail = null;
			this.size = 0;
		}

		public PersistentStack<E> toReverseStack() {
			PersistentStack<E> stack = new PersistentStack<E>();
			PersistentStack<E> tail = this;
			while (!tail.isEmpty()) {
				stack = stack.push(tail.head);
				tail = tail.tail;
			}
			return stack;
		}

		public boolean isEmpty() {
			return this.size == 0;
		}

		public PersistentStack<E> push(E obj) {
			return new PersistentStack<E>(obj, this);
		}
	}

	private PersistentStack<E> order;
	private PersistentStack<E> reverse;

	
	private PersistentQueue(PersistentStack<E> order, PersistentStack<E> reverse) {
		this.order = order;
		this.reverse = reverse;
	}
	
	public PersistentQueue() {
		this.order = PersistentStack.emptyStack();
		this.reverse = PersistentStack.emptyStack();
	}

	public PersistentQueue<E> enqueue(E e) {
		if (null == e)
			throw new IllegalArgumentException();
		return new PersistentQueue<E>(this.order.push(e), this.reverse);
	}

	public PersistentQueue<E> dequeue() {
		if (this.isEmpty())
			throw new NoSuchElementException();
		if (!this.reverse.isEmpty()) {
			return new PersistentQueue<E>(this.order, this.reverse.tail);
		} else {
			// revers the ordered stack then "clean" that stack
			return new PersistentQueue<E>(PersistentStack.emptyStack(),
					this.order.toReverseStack().tail);
		}
	}

	private void balanceQueue() {
		this.reverse = this.order.toReverseStack();
		this.order = PersistentStack.emptyStack();
	}

	public E peek() {
		if (this.isEmpty())
			throw new NoSuchElementException();
		if (this.reverse.isEmpty())
			balanceQueue();
		return this.reverse.head;
	}

	public boolean isEmpty() {
		return size() == 0;
	}

	public int size() {
		return this.order.size + this.reverse.size;
	}
}

Creating a simple immutable stack

Last time, I explained the basic meaning of immutability. The simplest useful example of an immutable class is an immutable stack. Immutable stacks work just like regular stacks – with Push(), Pop(), and Peek() methods – except that instead of mutating the original instance, Push() and Pop() return a new, modified, instance.

In code, that looks like:

public interface IStack<T> {
    IStack<T> Push(T element);
    IStack<T> Pop();
    T Peek();
    bool IsEmpty { get; }
}

IStack<int> myStack = empty;
myStack = myStack.Push(1).Push(2).Push(3);

while (!myStack.IsEmpty) {
    Console.WriteLine(myStack.Peek());
    myStack = myStack.Pop();
}

Each implementation of this interface would supply a singleton empty instance; since they are immutable; there is no need to have more than one empty instance. Thus, there is no need for a constructor. Since Pop() needs to return the new stack, it cannot return the removed item; therefore, Peek() is the only way to get the item.

As a side benefit, this pattern naturally enables and encourages fluent interfaces, since each mutation methods returns a new instance. (eg, myStack.Push(1).Push(2).Push(3))

The naive implementation of this interface would use a list, and copy the list when creating a new stack:

public class NaiveStack<T> : IStack<T> {
    private readonly ReadOnlyCollection<T> items;
    private NaiveStack(IList<T> items) {
        this.items = new ReadOnlyCollection<T>(items);
    }

    public static readonly NaiveStack<T> Empty 
        = new NaiveStack<T>(new T[0]);

    public IStack<T> Push(T element) {
        var list = new List<T>(items);
        list.Add(element);
        return new NaiveStack<T>(list);
    }

    public IStack<T> Pop() {
        if (IsEmpty)
            throw new InvalidOperationException("Stack is empty");
        var list = new List<T>(items);
        list.RemoveAt(list.Count - 1);
        return new NaiveStack<T>(list);
    }

    public T Peek() { return items.Last(); }
    public bool IsEmpty { 
        get { return items.Count == 0; } 
    }
}

The problem with this approach is that each mutation requires an O(n) copy to create the list behind the new instance. This makes Push() and Pop() awfully slow, and vastly increases the memory footprint until the intermediate instances can be GCd.

To solve these problems, we can design the stack as a persistent data structure. Instead of copying anything when pushing on to a stack, we cab make the new stack instance hold only the newly added item, and maintain a reference to the previous instance storing the rest of the items. Popping from the stack can then simply return the previous instance. Basically, the stack becomes an immutable single-linked list.

public abstract class PersistentStack<T> : IStack<T> {
    public static readonly PersistentStack<T> Empty = new EmptyNode();

    public IStack<T> Push(T element) {
        return new LinkNode(this, element);
    }

    private class EmptyNode : PersistentStack<T> {
        public override IStack<T> Pop() {
            throw new InvalidOperationException("Stack is empty");
        }
        public override T Peek() { 
            throw new InvalidOperationException("Stack is empty");
        }
        public override bool IsEmpty { get { return true; } }
    }

    private class LinkNode : PersistentStack<T> {
        readonly IStack<T> previous;
        readonly T element;

        public LinkNode(IStack<T> previous, T element) {
            this.previous = previous;
            this.element = element;
        }

        public override IStack<T> Pop() {
            return previous;
        }
        public override T Peek() { return element; }
        public override bool IsEmpty { get { return false; } }
    }
    public abstract IStack<T> Pop();
    public abstract T Peek();
    public abstract bool IsEmpty { get; }
}

In this implementation, the empty and non-empty nodes are different enough that I decided to use polymorphism rather than if statements. The PersistentStack type contains the only common logic (Push()); everything else is implement by the two concrete classes.

Here, the empty stack truly is a singleton; only one instance will ever be created for each element type. It can only be used as a starting point to create new stacks; the other methods simply throw exceptions. It also serves as the final node in the chain for all non-empty stacks.

Pushing onto any stack (empty or not) returns a new node that holds the new item and points to the previous stack; popping a non-empty stack simply returns that reference.

Like other linked lists, this approach implements both Push() and Pop() in O(1) in both memory and time.